239 research outputs found

    Automated Recognition of Brain Region Mentions in Neuroscience Literature

    The ability to computationally extract mentions of neuroanatomical regions from the literature would assist linking to other entities within and outside of an article. Examples include extracting reports of connectivity or region-specific gene expression. To facilitate text mining of neuroscience literature we have created a corpus of manually annotated brain region mentions. The corpus contains 1,377 abstracts with 18,242 brain region annotations. Interannotator agreement was evaluated for a subset of the documents, and was 90.7% and 96.7% for strict and lenient matching, respectively. We observed a large vocabulary of over 6,000 unique brain region terms and 17,000 words. For automatic extraction of brain region mentions we evaluated simple dictionary methods and complex natural language processing techniques. The dictionary methods based on neuroanatomical lexicons recalled 36% of the mentions with 57% precision. The best performance was achieved using a conditional random field (CRF) with a rich feature set. Features were based on morphological, lexical, syntactic and contextual information. The CRF recalled 76% of mentions at 81% precision; when partial matches are counted, recall and precision increase to 86% and 92%, respectively. We suspect a large amount of error is due to coordinating conjunctions, previously unseen words and brain regions of less commonly studied organisms. We found context windows, lemmatization and abbreviation expansion to be the most informative techniques. The corpus is freely available at http://www.chibi.ubc.ca/WhiteText/
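
The difference between strict and lenient (partial-match) scoring reported in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact matching rules, so the sketch assumes that a strict match requires identical span boundaries and a lenient match requires any character overlap. The function names and the toy spans are hypothetical and are not taken from the WhiteText corpus.

```python
from typing import List, Tuple

# A mention is a (start, end) pair of character offsets, end exclusive.
Span = Tuple[int, int]


def overlaps(a: Span, b: Span) -> bool:
    """Return True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def precision_recall(gold: List[Span], predicted: List[Span],
                     lenient: bool = False) -> Tuple[float, float]:
    """Score predicted mentions against gold annotations.

    Assumption: strict matching means identical boundaries;
    lenient matching means any overlap with a span on the other side.
    """
    if lenient:
        match = overlaps
    else:
        match = lambda a, b: a == b
    # Precision counts predicted spans that match some gold span.
    matched_pred = sum(any(match(p, g) for g in gold) for p in predicted)
    # Recall counts gold spans that are matched by some predicted span.
    matched_gold = sum(any(match(g, p) for p in predicted) for g in gold)
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    return precision, recall


# Invented toy data: three gold mentions, three predictions.
gold = [(0, 11), (20, 38), (50, 58)]
predicted = [(0, 11), (25, 38), (70, 75)]

strict_scores = precision_recall(gold, predicted)
lenient_scores = precision_recall(gold, predicted, lenient=True)
print("strict :", strict_scores)
print("lenient:", lenient_scores)
```

In the toy data the second prediction covers only part of a gold mention, so it counts as an error under strict scoring and as a hit under lenient scoring. This is the kind of gap the abstract reports between 76%/81% and 86%/92%.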

    “Guilt by Association” Is the Exception Rather Than the Rule in Gene Networks

    Gene networks are commonly interpreted as encoding functional information in their connections. An extensively validated principle called guilt by association states that genes which are associated or interacting are more likely to share function. Guilt by association provides the central top-down principle for analyzing gene networks in functional terms or assessing their quality in encoding functional information. In this work, we show that functional information within gene networks is typically concentrated in only a very few interactions whose properties cannot be reliably related to the rest of the network. In effect, the apparent encoding of function within networks has been largely driven by outliers whose behaviour cannot even be generalized to individual genes, let alone to the network at large. While experimentalist-driven analysis of interactions may use prior expert knowledge to focus on the small fraction of critically important data, large-scale computational analyses have typically assumed that high-performance cross-validation in a network is due to a generalizable encoding of function. Because we find that gene function is not systemically encoded in networks, but dependent on specific and critical interactions, we conclude it is necessary to focus on the details of how networks encode function and what information computational analyses use to extract functional meaning. We explore a number of consequences of this and find that network structure itself provides clues as to which connections are critical and that systemic properties, such as scale-free-like behaviour, do not map onto the functional connectivity within networks

    Meta-Analysis of Kindling-Induced Gene Expression Changes in the Rat Hippocampus

    Numerous studies have been performed to examine gene expression patterns in the rodent hippocampus in the kindling model of epilepsy. However, recent reviews of this literature have revealed limited agreement among studies. Because this conclusion was based on retrospective comparison of reported “hit lists” from individual studies, we hypothesized that re-analysis of the original expression data would help address this concern. In this paper, we reanalyzed four genome-wide expression studies of excitotoxin-induced kindling in rat and performed a statistical meta-analysis. The meta-analysis revealed over 800 genes which show significant change in expression 24 h after initial seizure induction, and 59 genes altered after 10 days. To evaluate our results in light of previous work, we assembled a reference list of genes formed from a consensus of the published literature. Our profiles include most of the genes in this reference list, and most of the additional genes are from pathways or biological processes previously recognized to be altered in kindling. In addition, our results emphasized expression changes in lipid metabolism and protein degradation pathways. We conclude that a cautious re-analysis of published expression data can help illuminate genes and pathways underlying kindling. Supplementary Material is available at http://www.chibi.ubc.ca/faculty/pavlidis/meta-analysis-of-brain-kindling

    Progress and challenges in the computational prediction of gene function using networks


    Analysis of strain and regional variation in gene expression in mouse brain

    BACKGROUND: We performed a statistical analysis of a previously published set of gene expression microarray data from six different brain regions in two mouse strains. In the previous analysis, 24 genes showing expression differences between the strains and about 240 genes with regional differences in expression were identified. Like many gene expression studies, that analysis relied primarily on ad hoc 'fold change' and 'absent/present' criteria to select genes. To determine whether statistically motivated methods would give a more sensitive and selective analysis of gene expression patterns in the brain, we decided to use analysis of variance (ANOVA) and feature selection methods designed to select genes showing strain- or region-dependent patterns of expression. RESULTS: Our analysis revealed many additional genes that might be involved in behavioral differences between the two mouse strains and functional differences between the six brain regions. Using conservative statistical criteria, we identified at least 63 genes showing strain variation and approximately 600 genes showing regional variation. Unlike ad hoc methods, ours have the additional benefit of ranking the genes by statistical score, permitting further analysis to focus on the most significant. Comparison of our results to the previous studies and to published reports on individual genes shows that we achieved high sensitivity while preserving selectivity. CONCLUSIONS: Our results indicate that molecular differences between the strains and regions studied are larger than indicated previously. We conclude that for large complex datasets, ANOVA and feature selection, alone or in combination, are more powerful than methods based on fold-change thresholds and other ad hoc selection criteria.
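
The idea of ranking genes by a statistical score rather than by a fold-change cutoff can be sketched as follows. This is a simplified illustration, not the authors' procedure: it computes a one-way ANOVA F statistic per gene across groups (for example, brain regions), whereas the study considers both strain and region. The gene names, expression values, and function names are invented for the example.

```python
from statistics import mean
from typing import Dict, List


def one_way_f(groups: List[List[float]]) -> float:
    """One-way ANOVA F statistic for a single gene.

    Each inner list holds the expression values measured in one group.
    """
    k = len(groups)                       # number of groups
    n = sum(len(g) for g in groups)       # total number of samples
    grand = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_genes(expression: Dict[str, List[List[float]]]) -> List[tuple]:
    """Return (gene, F) pairs sorted from highest to lowest F."""
    scores = [(gene, one_way_f(groups)) for gene, groups in expression.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical data: two genes, two groups, three replicates each.
expression = {
    "geneA": [[1.0, 1.2, 0.9], [3.0, 3.1, 2.9]],  # clear group difference
    "geneB": [[2.0, 2.5, 1.5], [2.1, 2.4, 1.6]],  # mostly noise
}

ranked = rank_genes(expression)
for gene, f_value in ranked:
    print(gene, round(f_value, 2))
```

Unlike a fixed fold-change threshold, the F statistic weighs the between-group difference against replicate noise, which gives the ordered gene list the abstract describes. A p-value would additionally require the F distribution, which is omitted here.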

    Integration of Neuroimaging and Microarray Datasets through Mapping and Model-Theoretic Semantic Decomposition of Unstructured Phenotypes

    An approach towards heterogeneous neuroscience dataset integration is proposed that uses Natural Language Processing (NLP) and a knowledge-based phenotype organizer system (PhenOS) to link ontology-anchored terms to underlying data from each database, and then maps these terms based on a computable model of disease (SNOMED CT®). The approach was implemented using sample datasets from fMRIDC, GEO, The Whole Brain Atlas and Neuronames, and allowed for complex queries such as “List all disorders with a finding site of brain region X, and then find the semantically related references in all participating databases based on the ontological model of the disease or its anatomical and morphological attributes”. Precision of the NLP-derived coding of the unstructured phenotypes in each dataset was 88% (n = 50), and precision of the semantic mapping between these terms across datasets was 98% (n = 100). To our knowledge, this is the first example of the use of both semantic decomposition of disease relationships and hierarchical information found in ontologies to integrate heterogeneous phenotypes across clinical and molecular datasets.

    Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms

    The growth in popularity of RNA expression microarrays has been accompanied by concerns about the reliability of the data, especially when comparing between different platforms. Here, we present an evaluation of the reproducibility of microarray results using two platforms, Affymetrix GeneChips and Illumina BeadArrays. The study design is based on a dilution series of two human tissues (blood and placenta), tested in duplicate on each platform. The results of a comparison between the platforms indicate very high agreement, particularly for genes which are predicted to be differentially expressed between the two tissues. Agreement was strongly correlated with the level of expression of a gene. Concordance was also improved when probes on the two platforms could be identified as being likely to target the same set of transcripts of a given gene. These results shed light on the causes or failures of agreement across microarray platforms. The set of probes we found to be most highly reproducible can be used by others to help increase confidence in analyses of other data sets using these platforms.

    Numerical Modelling of Melt Behaviour in the Lower Vessel Head of a Nuclear Reactor

    Acknowledgements: The authors would like to thank the EPSRC MEMPHIS multi-phase programme grant, the EPSRC Computational modelling for advanced nuclear power plants project and the EU FP7 projects THINS and GoFastR for helping to fund this work. Peer reviewed.

    Numerical Modelling of Debris Bed Water Quenching

    Acknowledgements: The authors would like to thank the EPSRC MEMPHIS multi-phase programme grant, the EPSRC Computational modelling for advanced nuclear power plants project, the EU FP7 projects THINS and GoFastR and ExxonMobil for helping to fund this work. Peer reviewed.